Concurrent Probabilistic Temporal Planning with Policy-Gradients

نویسندگان

  • Douglas Aberdeen
  • Olivier Buffet
چکیده

We present an any-time concurrent probabilistic temporal planner that includes continuous and discrete uncertainties and metric functions. Our approach is a direct policy search that attempts to optimise a parameterised policy using gradient ascent. Low memory use, plus the use of function approximation methods, plus factorisation of the policy, allow us to scale to challenging domains. This Factored Policy Gradient (FPG) Planner also attempts to optimise both steps to goal and the probability of success. We compare the FPG planner to other planners on CPTP domains, and on simpler but better studied probabilistic non-temporal domains.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Policy-Gradient Methods for Planning

Probabilistic temporal planning attempts to find good policies for acting in domains with concurrent durative tasks, multiple uncertain outcomes, and limited resources. These domains are typically modelled as Markov decision problems and solved using dynamic programming methods. This paper demonstrates the application of reinforcement learning — in the form of a policy-gradient method — to thes...

متن کامل

Concurrent Probabilistic Temporal Planning

Probabilistic planning problems are often modeled as Markov decision processes (MDPs), which assume that a single action is executed per decision epoch and that actions take unit time. However, in the real world it is common to execute several actions in parallel, and the durations of these actions may differ. This paper presents efficient methods for solving probabilistic planning problems wit...

متن کامل

Probabilistic Temporal Planning

Planning research has explored the issues that arise when planning with concurrent and durative actions. Separately, planners that can cope with probabilistic effects have also been created. However, few attempts have been made to combine both probabilistic effects and concurrent durative actions into a single planner. The principal one of which we are aware was targeted at a specific domain. W...

متن کامل

Prottle: A Probabilistic Temporal Planner

Planning with concurrent durative actions and probabilistic effects, or probabilistic temporal planning, is a relatively new area of research. The challenge is to replicate the success of modern temporal and probabilistic planners with domains that exhibit an interaction between time and uncertainty. We present a general framework for probabilistic temporal planning in which effects, the time a...

متن کامل

The Factored Policy Gradient planner (IPC-06 Version)

We present the Factored Policy Gradient (FPG) planner: a probabilistic temporal planner designed to scale to large planning domains by applying two significant approximations. Firstly, we use a “direct” policy search in the sense that we attempt to directly optimise a parameterised plan using gradient ascent. Secondly, the policy is factored into a per action mapping from a partial observation ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2007